Contextual Visual Similarity

Authors

  • Xiaofang Wang
  • Kris M. Kitani
  • Martial Hebert
Abstract

Measuring visual similarity is critical for image understanding. But what makes two images similar? Most existing work on visual similarity assumes that images are similar because they contain the same object instance or category. However, the reasons why images are similar are much more complex. For example, from the perspective of category, a black dog image is similar to a white dog image. In terms of color, however, a black dog image is more similar to a black horse image than to the white dog image. This example illustrates that visual similarity is ambiguous but can be made precise when given an explicit contextual perspective. Based on this observation, we propose the concept of contextual visual similarity. To be concrete, we examine the concept of contextual visual similarity in the application domain of image search. Instead of providing only a single image for image similarity search (e.g., Google image search), we require three images. Given a query image, a second positive image, and a third negative image that is dissimilar to the first two, we define a contextualized similarity search criterion. In particular, we learn feature weights over all the feature dimensions of each image such that, after reweighting the features, the distance between the query image and the positive image is small and their distances to the negative image are large. The learned feature weights encode the contextualized visual similarity specified by the user and can be used for attribute-specific image search. We also show the usefulness of our contextualized similarity weighting scheme for different tasks, such as answering visual analogy questions and unsupervised attribute discovery.
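
The reweighting idea in the abstract can be illustrated with a small sketch. This is not the authors' formulation, which the abstract does not spell out. It assumes a weighted squared Euclidean distance, non-negative weights that sum to one, and an L2 regularizer with a hypothetical strength `lam`. The function names and the four-dimensional toy features (dog, horse, black, white) are likewise assumptions made for illustration. Under these assumptions the objective is linear in the weights plus a quadratic term, so the minimizer is a Euclidean projection onto the probability simplex.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]                      # sort in descending order
    css = np.cumsum(u) - 1.0                  # cumulative sums minus the target total
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]  # last index that stays positive
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)


def learn_context_weights(query, positive, negative, lam=1.0):
    """Weights that pull query/positive together and push the negative away.

    Minimizes  d_w(q, p) - 0.5 * (d_w(q, n) + d_w(p, n)) + lam * ||w||^2
    over the simplex, where d_w(a, b) = sum_i w_i * (a_i - b_i)^2.
    (Illustrative objective, assumed here; not taken from the paper.)
    """
    cost = (query - positive) ** 2 - 0.5 * (
        (query - negative) ** 2 + (positive - negative) ** 2
    )
    # A linear cost plus an L2 penalty reduces to projecting -cost / (2 * lam).
    return project_to_simplex(-cost / (2.0 * lam))


def weighted_distance(a, b, w):
    """Weighted squared Euclidean distance between feature vectors a and b."""
    return float(np.sum(w * (a - b) ** 2))


# Toy features: [is_dog, is_horse, is_black, is_white] (hypothetical).
black_dog = np.array([1.0, 0.0, 1.0, 0.0])
black_horse = np.array([0.0, 1.0, 1.0, 0.0])
white_dog = np.array([1.0, 0.0, 0.0, 1.0])

# Context "color": the black horse is the positive, the white dog the negative.
w = learn_context_weights(black_dog, black_horse, white_dog)
d_pos = weighted_distance(black_dog, black_horse, w)
d_neg = weighted_distance(black_dog, white_dog, w)
```

In this toy setup the learned weights concentrate on the two color dimensions, so the black dog ends up closer to the black horse than to the white dog, which mirrors the example in the abstract. Swapping the positive and negative images would instead favor the category dimensions.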

Similar resources

The interaction of contextual constraints and parafoveal visual information in reading.

An experiment is reported which demonstrates that contextual constraints and parafoveal visual information interact during reading. As subjects read sentences, the parafoveal visual information available from a target area in the sentence was varied: the parafoveal information was either visually similar or dissimilar to a target word the subject would later fixate. The visual similarity of the...

Key Phrase Based - Graph Representation for Contextual Similarity Between Documents

Finding similarity between documents which have no common key words has not received much attention till now. Here we develop a graph based representation for finding contextual similarity between documents which are totally disjoint in terms of its keywords. For this a bi-grams based key phrase approach is designed. Different algorithms for pairwise similarity were studied and evolved to suit ...

Learning Warps Object Representations in the Ventral Temporal Cortex

The human ventral temporal cortex (VTC) plays a critical role in object recognition. Although it is well established that visual experience shapes VTC object representations, the impact of semantic and contextual learning is unclear. In this study, we tracked changes in representations of novel visual objects that emerged after learning meaningful information about each object. Over multiple tr...

Fusing Contextual Metadata and Visual Similarity in Mobile Media Location-Based Classification

This paper describes a new approach to the automatic identification of the location of mobile images. New methods for determining image similarity are combined with analysis of automatically acquired contextual metadata to produce location information. Results are reported on a database of 1209 real images collected on Nokia 7610 camera phones by different users in 12 different locations across...

Exploiting Contextual Information for Image Re-ranking

Basically, given a query image, a CBIR system aims at retrieving the most similar images in a collection by taking into account image visual properties (such as shape, color, and texture). Collection images are ranked in decreasing order of similarity, according to a given image descriptor. However, in general, these approaches perform only and compute similarity (or distance) measures conside...

Journal title:
  • CoRR

Volume abs/1612.02534

Publication date 2016